HOW TO CREATE A LIST OF COMPANY HOMEPAGES LISTED IN THE CATALOG http://dir.yahoo.com/Business_and_Economy/Companies/Electronic_Commerce

This is only an example of using P.S.E. You can create your own profiles to perform more complex searches, such as searches for mail addresses, telephone numbers or street addresses. Modify this example to get the results most useful to you. A future version of P.S.E. will be able to perform actions automatically, such as sending mail to all found addresses or downloading all found pages.

The first step is to choose a resource that contains a list of companies, for example http://dir.yahoo.com/Business_and_Economy/Companies/Electronic_Commerce. Let us create a search profile that reads this document and builds a database from it.

Follow these steps:

  • Press the "New" button.
  • Remove all existing property templates.
  • Each company information string includes the company name, the homepage address and a brief comment. Let us create three property templates that make the program read the company data:
    1. Mask "Company address".
      Value: \w-<li><a href="\w+\*\w-">
      Parent property: None
      Type: "Address"
      Translate format: FALSE
      Maximum found: Not test
    2. Mask "Company Name".
      Value: \*\w-</a>
      Parent property: "Company address".
      Depend type: "Search after"
      Type: "Text"
      Translate format: TRUE
      Maximum found: 1
    3. Mask "Comments".
      Value: \*\w-<li>
      Parent property: "Company Name".
      Depend type: "Search after"
      Type: "Text"
      Translate format: TRUE
      Maximum found: 1
  • Set the start address to http://dir.yahoo.com/Business_and_Economy/Companies/Electronic_Commerce
  • Press the "Start" button.
  • Open the "Results" panel and press the "Report values" button. The list of companies will appear. You can save it to disk and use it in any way you like, for instance to download all the pages automatically.
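
For readers who want to see what the three chained templates are doing, the same extraction can be roughly approximated outside P.S.E. with a regular expression. The sketch below is Python, not P.S.E. mask syntax. It assumes that \* marks the value to capture and that \w- tolerates optional whitespace, which is only an interpretation of the masks above. The HTML sample is made up, and the names SAMPLE_HTML, ENTRY and extract_companies are hypothetical.

```python
import re

# Made-up sample of directory markup; the real catalog page may look different.
SAMPLE_HTML = """
<ul>
<li><a href="http://www.example.com/">Example Shop</a> - online store for widgets.
<li><a href="http://www.example.org/">Example Payments</a> - payment processing services.
</ul>
"""

# Rough regex counterpart of the three chained templates (an assumption,
# not the actual P.S.E. matching algorithm):
#   1. address: the text inside href="..." of a link that follows <li>
#   2. name:    the text up to </a>, searched after the address
#   3. comment: the text up to the next <li> (or the end of the list),
#               searched after the name
ENTRY = re.compile(
    r'<li>\s*<a\s+href="\s*(?P<address>[^"]*?)\s*">'
    r'(?P<name>.*?)\s*</a>'
    r'(?P<comment>.*?)\s*(?=<li>|</ul>|\Z)',
    re.IGNORECASE | re.DOTALL,
)


def extract_companies(html):
    """Return a list of (address, name, comment) tuples found in html."""
    rows = []
    for match in ENTRY.finditer(html):
        # Drop the leading " - " separator used in the sample comments.
        comment = match.group("comment").strip().lstrip("-").strip()
        rows.append((match.group("address"), match.group("name").strip(), comment))
    return rows


if __name__ == "__main__":
    # Print one tab-separated line per company, similar to a saved report.
    for address, name, comment in extract_companies(SAMPLE_HTML):
        print(f"{address}\t{name}\t{comment}")
```

Each tuple corresponds to one row of the report: the address found by the first template, then the name and comment found by the two dependent "Search after" templates.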
